docs(secrets): scoped-store v1 spec — secrets.<scope>.yaml with per-scope keys#358
docs(secrets): scoped-store v1 spec — secrets.<scope>.yaml with per-scope keys#358Cre-eD wants to merge 12 commits into
Conversation
…aml) Concrete spec for the RFC's Minimal v1: SOPS file-per-scope (.sc/stacks/<stack>/secrets.<scope>.yaml), age recipients governed by a CODEOWNERS-gated .sc/scopes.yaml, scope-aware CLI verbs, deterministic deploy-time resolution with hard-fail-on-missing, plaintext-leak lint, strict mode-A backward compatibility (additive files old binaries never open), CI demotion plan (pull_request jobs get a pr-scope key instead of the master SC_CONFIG). Implementation lands in this same PR after design sign-off. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Semgrep Scan ResultsRepository:
Scanned at 2026-07-04 12:17 UTC |
Security Scan ResultsRepository:
Scanned at 2026-07-04 12:18 UTC |
📊 Statement coverageMeasured on the documented included set (see
Baseline: |
Panel decision (Codex + Gemini + 2 Claude lenses): - D1 scan-only: v1 scopes the 4 PR scan/lint jobs; deploy-shaped PR jobs and crossguard's Pulumi creds stay out (crossguard -> OIDC, never the pr scope). - D2 SC_KEY_PR interim + committed KMS/OIDC v2 (reuses existing KMS+OIDC infra; recipient swap on the same files, stored key retired at v2). - D3 scope files in the devops parent (integrail) store; resolver still supports consumer-repo scopes for the later deploy sweep. Bakes in the 7 consensus P0s: parent/child merge constraint, real hard-fail (not swallowed warn), scope-name + path:scope:key AAD binding, sc recipient-verify lint, preview-deploy exclusion, secretScope PR-clamp, SC_KEY_PR blast-radius bound. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Implements the v1 scoped-secret store on sc's OWN cipher layer (RSA-OAEP + X25519 sealed box) rather than pulling in SOPS/filippo.io-age — neither is a current dependency and adding them would be a large supply-chain surface. Value confidentiality, per-recipient sealing, and AEAD/OAEP associated-data binding are all provided by the existing ciphers package. Recipients are SSH public keys (ssh-ed25519 / ssh-rsa), consistent with the whole-file store. - ciphers: add EncryptLargeStringWithAAD / DecryptLargeStringWithAAD / DecryptLargeStringWithEd25519AAD. A nil AAD reproduces the exact legacy wire format (RSA OAEP label / X25519 associated data), so the whole-file store is byte-for-byte unchanged and all existing cipher tests pass. - scoped: scopes.yaml governance model (per-scope recipient sets, allow/disallow, fail-closed schemaVersion guard) + secrets.<scope>.yaml file format (committed-encrypted, structure readable / values opaque), per-recipient sealing keyed by SHA256 SSH fingerprint, set/get/delete, and offline VerifyConsistency for lint. - Security bindings (P0-3): each value's AAD is domain-separated scope\x00key, so a ciphertext cannot be transplanted to another scope or key; LoadScopeFile rejects a file whose in-name scope != filename scope (rename attack). - Tests: RSA + ed25519 round-trip (incl multi-chunk), multi-recipient, non- recipient rejection, scope+key transplant resistance, rename detection, fail-closed version guard, consistency drift, governance, name validation. Whole-file store, CLI wiring, and deploy-time resolution are unchanged in this commit and follow next. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
The RFC named SOPS, but the implementation reuses sc's own ciphers (no getsops/sops or filippo.io/age dependency). Update the spec's crypto/format layer, recipient type (SSH keys, not native age), integrity model (AEAD/OAEP scope\x00key binding + filename/scope check, not SOPS MAC), lint gates, and compatibility section to match what landed. Status -> IN PROGRESS. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
…ow/lint/doctor) Adds the `sc secrets scope` command group over the scoped store, namespaced so it never collides with the whole-file store's existing allow/disallow/add/etc. - set (value arg or stdin), get (ambient key), list, delete - allow/disallow: update .sc/scopes.yaml then reseal every scope file of that scope to the new recipient set (Reencrypt: decrypt-with-current-key then re-seal, all-or-nothing); disallow prints the mandatory rotate-values warning since removal does not rewrite git history - lint: per-file VerifyConsistency + recipient set == scopes.yaml (drift fails) + scope/filename binding — the CI gate - doctor: which scopes the ambient key can open - set refuses to write into a file whose recipients have drifted from scopes.yaml (reconcile via allow/disallow first), so a value is never sealed to a stale set - package: ScopeFile.Reencrypt, path helpers (ScopesPath/StackDir/ScopeFilePath/ ListScopeFiles), exported Fingerprint/SameRecipients; Save now MkdirAll's parents Verified end-to-end on a real repo: allow->set(arg+stdin)->list->get(correct plaintext)->lint OK->doctor YES; committed file is encrypted with readable structure and zero plaintext leak. Package tests: 20 green. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
…ini) No P0s — both reviews confirmed the crypto core (transplant resistance, backward-compat, per-file atomic reseal) is sound. Fixes for the CLI/lint findings: - [Codex P1] scope get/doctor now resolve the decrypt key from --key-file, then SC_KEY_<SCOPE> / SC_SCOPE_KEY env, then the ambient config — so a pull_request scan job can hold ONLY SC_KEY_PR without a full SIMPLE_CONTAINER_CONFIG. Verified: get returns the value with only SC_KEY_PR set. - [Gemini P1] reconcileRecipients (allow/disallow) is now two-phase: reseal every affected file in memory first, persist only after all succeed — a mid-way decrypt/parse failure can no longer leave files/scopes.yaml drifted. - [Gemini P2] the private key is fetched lazily — declaring the first recipient of an empty scope needs no ambient key. - [Codex P2] VerifyConsistency (lint) now base64-decodes every chunk and requires >= AEAD-tag length, rejecting a plaintext value smuggled under a valid recipient fingerprint — the offline plaintext-leak gate now has teeth. - [Codex P2] Scopes.Allow rejects unsupported recipient key types (ECDSA, certs) that fingerprint but cannot be sealed to, failing fast at governance time. Tests: 22 green (added plaintext-chunk + unsupported-key cases). Re-verified end-to-end: allow->set->second-allow(reseal)->get-via-SC_KEY_PR->lint->ecdsa-reject. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Wires scoped secrets into the deploy read path (provisioner
readSecretsDescriptorFromFile): after loading a stack's whole-file values, merge
in every secrets.<scope>.yaml the AMBIENT KEY is a recipient of.
Design improvement over the RFC's secretScope: field — resolution is key-driven,
not config-driven, which DROPS P0-6 (PR-manipulable scope selection). A job
holding only SC_KEY_PR is a recipient of the pr scope alone, so it cannot decrypt
secrets.prod.yaml — the pull_request clamp is cryptographic, with no config to
subvert.
scoped.ResolveScopedValues semantics:
- no key / no scope files -> empty, no error (repos without scopes unaffected;
the key is not even parsed). Verified: provisioner tests unchanged.
- scope files the key can't open are skipped (least privilege).
- whole-file store wins on conflict; scoped values only ADD new keys, so an
existing ${secret:} resolution can never change.
- HARD-FAIL (P0-2) on: a value the key IS a recipient of but can't decrypt
(tamper), a corrupt/renamed scope file, or a key present in two openable scopes
(ambiguous). Not being a recipient is not an error.
secret-get and ${secret:} both see merged scoped values transparently. Spec
updated to the key-driven model. Tests: key-determines-scope (A sees pr not prod,
B sees prod not pr, non-recipient sees nothing) + cross-scope-dup fail; scoped +
provisioner suites green; whole module builds.
Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
- [P1] Resolve scoped-only stacks: readSecretsDescriptor no longer bails on a
missing legacy secrets.yaml before scope resolution — a stack with only
secrets.<scope>.yaml now resolves (empty descriptor + scoped merge). If neither
a legacy file nor any openable scoped value exists, it still returns the
ignorable not-found (behavior preserved for secret-less stacks).
- [P1] Deploy path honors CI scope keys: ResolveScopedValues now takes multiple
candidate keys, and the provisioner gathers the ambient config key PLUS
SC_SCOPE_KEY and any SC_KEY_<SCOPE> (e.g. SC_KEY_PR) from the env — so a
pull_request job holding only its scope key resolves scoped secrets via
${secret:} and sc stack secret-get, matching the CLI's key resolution.
- [P2] Integrity hard-fails are never swallowed: scoped resolver failures (tamper,
corrupt/renamed file, ambiguous cross-scope key) are tagged scoped.ErrScopedIntegrity
and ReadStacks propagates them even under IgnoreSecretsMissing — so a deploy can
no longer proceed past a broken scope file just because the affected secret
wasn't referenced. Not being a recipient remains a silent skip (least privilege).
Tests: added multi-candidate-key resolution + ErrScopedIntegrity assertions;
scoped + provisioner suites green; whole module builds.
Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Verifies the three Codex-review fixes end-to-end at the provisioner: - scoped-only stack (no legacy secrets.yaml) resolves via a CI scope key from the environment (SC_SCOPE_KEY, no SIMPLE_CONTAINER_CONFIG) — P1a + P1b; - a corrupt scope file surfaces as scoped.ErrScopedIntegrity even when unreferenced — P2 (never swallowed); - a stack with neither legacy nor openable scoped secrets still reports plain not-found (os.ErrNotExist), preserving IgnoreSecretsMissing behavior. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
…lleability) + P1s Full multi-model panel (4 Claude lenses + Codex + Gemini). Two real P0s, both fixed with acceptance tests; the panel's third P0 (ed25519 whole-file format changed) was a false positive — main already used X25519 for ed25519, so the nil-AAD path is byte-identical (now proven by TestAAD_NilRoundTrip_LegacyCompatible). P0-A ed25519 downgrade attack (ciphers/encryption.go): DecryptLargeStringWithEd25519AAD fell back to the legacy decryptWithEd25519 for non-X25519 blobs, which ignores AAD and derives its key from public data — an attacker could forge a legacy blob with a chosen plaintext, drop it in a scope file under a real recipient's fingerprint, and the deploy would decrypt it, bypassing the scope/key binding. Now: refuse legacy blobs whenever an AAD is supplied (the aad-less whole-file/migration path is unchanged). P0-B RSA multi-chunk malleability (ciphers/encryption.go): every OAEP chunk used the same label, so a hand-edited scope file could reorder/drop/splice chunks of an RSA recipient's value and still decrypt to a permuted/truncated plaintext (lint didn't catch it). Now: the OAEP label binds chunk index + count (chunkLabel), gated on aad != nil so the legacy store stays byte-identical; empty-chunk-under-AAD is rejected. P1s: Disallow no longer filters the recipient slice in place (backing-array aliasing); scope/scopes files are written atomically (temp + rename, no torn writes that would hard-fail deploys); disallowing the last recipient is refused with a clear error; the provisioner's SC_KEY_* env scan is constrained to SC_KEY_<valid-scope> so an unrelated env var is not tried as a decryption key. Tests: TestAAD_MismatchFails (the core binding invariant, RSA + ed25519), TestRSAChunkFraming_ReorderTruncateSpliceFails, TestEd25519Downgrade_RejectedUnderAAD, TestAAD_NilRoundTrip_LegacyCompatible. All secrets + provisioner suites green. Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
… dup, gate, docs, tests) Closes the should-fix items from the full review panel. - Cross-stack transplant binding (Gemini P1): scope files now carry a Stack field, bound into every value's AAD (stack\0scope\0key) and verified against the parent directory on load. A ciphertext cannot be copied between stacks, scopes, or keys. - Stronger offline gate (Security/Coherence/Codex P1/P2): VerifyConsistency checks each chunk against the recipient's key type via ciphers.ValidateCiphertextShape — an RSA chunk must be the modulus size, an ed25519 chunk must be an X25519 sealed box (magic + min length). Replaces the coarse 16-byte floor; a plaintext value smuggled under a fingerprint is rejected offline. - lint duplicate detection (Codex/Gemini P2): now flags a key present in two scopes of a stack, or in both a scope and the legacy secrets.yaml (mode A) — the 'one key, one mode' guarantee, caught before deploy instead of hard-failing there. - Docs: fixed the spec CLI block (secrets scope <verb>, dropped nonexistent edit/ updatekeys, --key -> --key-file) + integrity section (stack + chunk-index binding, downgrade + shape gates); added a user-facing 'Per-scope secrets' section to secrets-management.md; added a v1 supersession note to the keyless-secrets RFC README. - Tests: real CLI harness (set/get/list/lint/doctor/disallow + reconcile-fail-closed + undeclared-scope refusal); SameRecipients; Delete; passphrase-protected-key rejection; RSA scope/key transplant; cross-stack transplant (ed25519 + RSA). All secrets + cmd_secrets + provisioner suites green; module builds; verified end-to-end (committed file carries stack:, no plaintext, CI-path get via SC_SCOPE_KEY, lint clean). Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
Round-2 architecture review (Codex + Gemini + Claude crypto lens) surfaced two real
weaknesses in direct-per-recipient value sealing, and 2 of 3 reviewers judged envelope
encryption materially better (not theater). Both weaknesses are structurally eliminated
by envelope, so switch to it while the format is still unshipped.
Weaknesses fixed:
- Cross-version RSA chunk splicing (Gemini): the OAEP chunk label bound index+count but
no per-encryption id, so chunks from two commits of the same key (same aad/idx/count)
could be mixed into a hybrid plaintext. Gone: values are no longer RSA-chunked.
- Per-recipient plaintext divergence (Claude lens): independent per-recipient seals let
an actor with repo write + a recipient pubkey seal a DIFFERENT value into one
recipient's slot (e.g. poison the prod deployer while local-decrypt review sees benign).
Gone: one shared value ciphertext, so a tampered slot fails to decrypt, never diverges.
Design: each value is EncryptedValue{ciphertext, wraps} — the value is AEAD-encrypted
once under a random 32-byte DEK (ChaCha20-Poly1305, AAD = stack\0scope\0key via new
ciphers.SealAEAD/OpenAEAD/GenerateDEK), and the DEK is wrapped per recipient with the
existing RSA-OAEP / X25519 path (32 bytes = single block, so NO chunking remains). This
also deletes the whole RSA chunk-framing surface prior rounds kept patching, and makes
reseal a DEK-rewrap. No new dependency (age was the alternative but reverses the
no-new-crypto-deps stance); the whole-file store is untouched.
Tests: TestEnvelope_NoPerRecipientDivergence (tampered wrap -> decrypt failure, not a
chosen value); all prior transplant/downgrade/lint/CLI/provisioner suites updated to the
envelope shape and green. Verified end-to-end (file carries ciphertext+wraps, no
plaintext, CI-path get, lint clean). Spec updated to the envelope design.
Signed-off-by: Dmitrii Creed <creeed22@gmail.com>
smecsia
left a comment
There was a problem hiding this comment.
Review: scoped-store v1
Strong, security-conscious work — reviewed the full crypto/store/resolve/governance/CLI/provisioner code, built it, ran the secrets test suites (all pass), CI is 22/22 green. No blocking issues; a few minor notes inline.
The security design holds up (verified, not just asserted)
- Envelope encryption: value AEAD-sealed once under a random DEK, DEK wrapped per recipient → no per-recipient divergence, no chunk-splicing.
- AAD binds
(stack, scope, key)into both the value AEAD and each DEK wrap → transplant resistance across scope/key/stack (covered byTransplantResistance_*tests). - Downgrade guard refuses legacy forgeable ed25519 blobs under an AAD binding (
TestEd25519Downgrade_RejectedUnderAAD). - RSA-OAEP label binds chunk idx+count → reorder/drop/splice fails (defense-in-depth; the DEK is single-chunk anyway).
- Deploy resolution is key-driven — the PR clamp is cryptographic, no config surface to subvert; hard-fails on tamper/ambiguity via
ErrScopedIntegrity, correctly propagated pastIgnoreSecretsMissing. - Fail-closed schema-version guards, filename↔scope + dir↔stack binding checks, all-or-nothing reseal, atomic writes (files before
scopes.yaml), andaad=nilreproduces the legacy wire format byte-for-byte.
The test suite specifically targets the adversarial paths (transplant, splice, downgrade, cross-scope ambiguity, version fail-closed, plaintext-chunk lint, envelope no-divergence) — nice.
Non-blocking note on scope/title
The PR title (docs(secrets): … spec) undersells this — it's a full ~3k-line implementation, not a spec. Suggest renaming to feat(secrets): scoped-store v1 … before merge so history/changelog reflect that.
Questions
- Is
sc secrets scope lintenforced as a required status check? A couple of the safety properties (no legacy/scope key collision, no cross-scope duplicates) are only caught by lint — see the inline note on the deploy-time merge. - v1 ships age/SSH recipients only (KMS/OIDC = v2). Is there a rotation runbook for a leaked scope key beyond
disallow+ the "rotate values" warning (which correctly notes git history isn't rewritten)?
| } | ||
| for k, v := range scopedVals { | ||
| if _, exists := desc.Values[k]; exists { | ||
| continue // legacy whole-file store wins; `sc secrets scope lint` forbids duplicates |
There was a problem hiding this comment.
Silent legacy-shadows-scoped at deploy. When the same key exists in both the legacy whole-file store and an openable scope, the legacy value wins here silently — no error, no warning. This is the fail-safe direction (an attacker who can only write a scope file can't override a legacy secret), so it's not a vuln. But it's a foot-gun: an operator who sets a scoped value expecting it to take effect gets the legacy value with no signal.
sc secrets scope lint flags this collision — but only helps if lint is a required CI check. Consider (a) confirming lint is required, and/or (b) emitting a deploy-time warning when a scoped key is shadowed by legacy.
| if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil { | ||
| return errors.Wrapf(err, "failed to create directory for %s", path) | ||
| } | ||
| tmp := path + ".tmp" |
There was a problem hiding this comment.
Fixed path + ".tmp" temp name means two concurrent writers to the same scope file would collide on the temp file. Non-issue for serial CLI/CI use, but os.CreateTemp(filepath.Dir(path), …) would be more robust. (Only ciphertext ever hits disk, so no plaintext-leak concern here.)
| if err != nil { | ||
| return "", errors.Wrap(err, "failed to read value from stdin") | ||
| } | ||
| return strings.TrimRight(string(data), "\n"), nil |
There was a problem hiding this comment.
strings.TrimRight(data, "\n") strips all trailing newlines from stdin. Piping a PEM/cert (sc secrets scope set TLS_KEY - < key.pem) silently drops its trailing newline, which can break exact-match consumers. Consider trimming a single trailing \n (e.g. strings.TrimSuffix), or documenting the behavior on the command help.
| // public key (e.g. "SHA256:abc…"). It is used as the per-recipient map key inside a | ||
| // scope file: readable, order-independent, and derivable from a private key so a | ||
| // decryptor can find its own slot. | ||
| func recipientFingerprint(authorizedKey string) (string, error) { |
There was a problem hiding this comment.
Minor: recipientFingerprint parses via ssh.ParseAuthorizedKey, while the wrap/validate paths (validateEncryptableRecipient, encryptForRecipients) parse the same key via ciphers.ParsePublicKey(secrets.TrimPubKey(...)). Two parse paths for one key — they agree for standard ssh-rsa/ssh-ed25519 keys, so this is just a "two sources of truth" robustness note, not a bug.
What
Implementation spec for the keyless-secrets RFC's Minimal v1: per-scope secret files (
secrets.pr.yaml,secrets.staging.yaml,secrets.prod.yaml, …) with per-scope recipient keys, so a CI context holds a key that opens exactly the scope files it needs — replacing the single masterSIMPLE_CONTAINER_CONFIGthat decrypts everything.Spec:
docs/design/keyless-secrets/scoped-store-v1.md. Highlights:.sc/scopes.yamlis the single scope→recipient surface;sc secrets lintfails on SOPS-metadata drift + plaintext leaks.${secret:}resolution (scope file → legacy store), hard-fail on missing/undecryptable — never a partial deploy.pull_requestjobs (previews, provision-preview) getSC_KEY_PRonly and stop receivingSC_CONFIG— closes the biggest remaining PR-reachable secret exposure.Process
Draft for design review (panel review planned like Phase 1). Implementation lands in this same PR after sign-off — kept as one consolidated PR per maintainer preference.
RFC: #346 (merged). ClickUp: 86caf67c3.